Abstract

We show that Trevisan's extractor and its variants [Tre01, RRV99] are secure against bounded
quantum storage adversaries. One instantiation gives the first such extractor to achieve an output
length $\Theta(K - b)$, where $K$ is the source's entropy and $b$ the adversary's storage, together with
a poly-logarithmic seed length. Another instantiation achieves a logarithmic key length, with a
slightly smaller output length $\Theta((K - b)/K^{\gamma})$ for any $\gamma > 0$. In contrast, the previous best
construction [Ts09] could only extract $(K/b)^{1/15}$ bits. Some of our constructions have the additional
advantage that every bit of the output is a function of only a polylogarithmic number of bits from 
the source, which is crucial for some cryptographic applications. 

Our argument is based on bounds for a generalization of quantum random access codes, which 
we call quantum functional access codes. This is crucial as it lets us avoid the local list-decoding 
algorithm central to the approach in [Ts09], which was the source of the multiplicative overhead.
1 Introduction



Randomness extractors are fundamental building blocks in pseudorandomness theory, with many 
applications to derandomization, error-correcting codes, and expanders, among others. They are also 
of central importance in cryptography, where they are often used to build key generation primitives. 
In this context, one usually has the notion of an adversary, a malicious observer who is trying to 
discover a bit of the honest player's output. A prominent model for adversaries is the bounded 
storage model, introduced by Maurer [Mau92], in which the adversary is allowed to store a limited 
amount of information about the extractor's input. 

Formally, we say that a function $\mathrm{Ext} : \{0,1\}^N \times \{0,1\}^t \to \{0,1\}^m$ is a $(K,\varepsilon)$ strong extractor if for
every distribution $X$ with min-entropy at least $K$ ($X$ is called the source) and uniformly random $Y$
(called the seed), the distribution $(Y, \mathrm{Ext}(X, Y))$ is within a statistical distance of at most $\varepsilon$ from
the uniform distribution. The extractor is said to be secure against $b$ bits of storage if $\mathrm{Ext}(X,Y)$ is
$\varepsilon$-close to uniform even from the point of view of an adversary who has been allowed to store $b$ bits
of information about $X$, and has also later been revealed the seed $Y$.

Constructions of extractors are known that are almost-optimal in all parameters, even in the presence 
of the adversary (in fact, a result by Lu [Lu04] shows that any $(K, \varepsilon)$ strong extractor is essentially a
$(K + b, \varepsilon)$ extractor secure against $b$ bits of storage). Nevertheless, in a world in which no adversary
can be trusted, König et al. [KMR05] introduced the following interesting twist: what if the adversary
is allowed quantum memory? In this setting, the fundamental difficulty that arises is a familiar one, 
with a long history: how much information can be encoded in a quantum state? 

The fact that this question can admit very different answers depending on its precise formulation is 
reflected in the fact that some, but not all, classical extractor constructions are secure in the presence 
of a quantum adversary, as was demonstrated in [GKK+07]. While many constructions have been
shown to be sound on a case-by-case basis [KMR05, KT08, FS08, Ts09], all have parameters that
are far from optimal either in terms of seed length or of output length.

Central to the proof of our result are bounds on a construct which we call quantum functional access 
codes (QFAC), and we introduce them next. 



Quantum functional access codes. Holevo [Hol73] was the first to tackle the question of the 
information capacity of a quantum state, showing that one needs at least n qubits in order to encode 
n bits of information. However, this bound only holds when it is required that the whole n bits be 
recoverable from the quantum storage. As such, it is generally not applicable in a cryptographic 
context, where typically even partial information is important. Instead of asking for the whole input 
$x \in \{0,1\}^n$ to be recoverable from its encoding $\Psi(x)$, Ambainis et al. [ANTsV02] consider encodings
in which it is only required that any bit of $x$ can be recovered from $\Psi(x)$ with probability $1/2+\varepsilon$ (over
the measurement's randomness), and they call such encodings 'random access codes' (RACs). Note
that, since the encoding is quantum, the recoverability of any one bit does not imply the recoverability
of the whole string $x$, so that Holevo's bound does not apply. Nevertheless, Ambainis et al. showed
that RACs require essentially $(1 - H(1/2 + \varepsilon))n$ qubits to encode $n$ bits, where $H$ is the binary
entropy function, providing a linear lower bound for fixed $\varepsilon$. These bounds have proved instrumental
in many results in information theory. In fact, as pointed out in [Ts09], random access codes provide
a way to construct one-bit extractors that are secure against quantum storage.
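
To get a feel for how the RAC bound behaves as the bias shrinks, the following minimal Python sketch evaluates $(1 - H(1/2+\varepsilon))n$ for a few values of $\varepsilon$. The function names and the choice $n = 1000$ are ours, purely for illustration; they do not come from [ANTsV02].

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy H(p) in bits, with the convention H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def rac_qubit_lower_bound(n: int, eps: float) -> float:
    """Evaluate (1 - H(1/2 + eps)) * n, the RAC bound quoted in the text."""
    return (1 - binary_entropy(0.5 + eps)) * n


if __name__ == "__main__":
    # Illustrative values only: n = 1000 input bits, several biases eps.
    for eps in (0.5, 0.25, 0.1, 0.01):
        print(f"eps = {eps}: bound = {rac_qubit_lower_bound(1000, eps):.3f}")
```

For perfect recovery ($\varepsilon = 1/2$) the expression equals $n$, and it shrinks quickly as $\varepsilon$ decreases, which is the scaling issue discussed next.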

We push this question even further: what if, instead of asking that the encoding lets us recover any 
bit of the input, we asked that it lets us recover any one out of some fixed set of functions of the 
input? For example, we could ask about encodings that let us recover the XOR of any k bits of the 
input,¹ but one can also consider more general settings.

One might ask about the relevance of such encodings, when we already know that there are strong 
linear lower bounds on RACs — surely, these will extend to any encoding which lets us recover more 
than any single bit of the input. The key point here is that, even though both Holevo's bound and 
the RAC lower bounds are linear in the input length when the success probability p is fixed, the 
two bounds scale very differently when one considers the dependence on p: while an improvement 
on Holevo's lower bound, due to Nayak and Salzman [NS06], scales as $n - \log 1/p$, the RAC bound
scales as $(4\varepsilon^2/\ln 2)n$ for small $\varepsilon = 2p - 1$. So we are asking, how does the minimal length of the code
scale with the success probability, depending on the set of functions that we are trying to recover? 

Define a $(n, b, \varepsilon)$ QFAC for a set of $n$-bit strings $A$ and a set of functions $\mathcal{C}$ from $A$ to $\{0,1\}$ as a
$b$-qubit encoding of strings $x \in A$ such that, for any function $f \in \mathcal{C}$, one can recover $f(x)$ from the
encoding of $x$ with success probability $1/2 + \varepsilon$. Intuitively, the more the set of functions $\mathcal{C}$ is error
resilient (i.e., the more spread-out the images $(f(x))_{f \in \mathcal{C}} \in \{0,1\}^{|\mathcal{C}|}$), the stronger the lower bound
should be on the length of the encoding. For example, using a simple reduction to known results we
can show that any $(n,b,\varepsilon)$ QFAC for the set $\mathcal{C} = \{f_y : x \mapsto x \cdot y \bmod 2,\ y \in \{0,1\}^n\}$ must have
length $b \ge n - \log 1/\varepsilon$. If one simply used the fact that any bit of $x$ can be recovered from such a
QFAC with probability $1/2 + \varepsilon$, the resulting bound would be the much weaker $\Omega(\varepsilon^2 n)$.
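
The notion of "spread-out images" can be checked by brute force for small $n$. The sketch below is illustrative only (the function names and the choice $n = 4$ are our assumptions): it compares the parity family $f_y$ above with the plain coordinate functions by computing the relative distance between the image vectors of every pair of distinct inputs.

```python
from itertools import product


def parity_images(x):
    """Image vector (x . y mod 2) over all y in {0,1}^n, i.e. (f_y(x))_y."""
    n = len(x)
    return [sum(a & b for a, b in zip(x, y)) % 2
            for y in product((0, 1), repeat=n)]


def relative_distance(u, v):
    """Fraction of positions at which the sequences u and v differ."""
    return sum(a != b for a, b in zip(u, v)) / len(u)


n = 4  # illustrative size; the enumeration is exponential in n
inputs = list(product((0, 1), repeat=n))

# Distances between image vectors under the parity family.
distances = {
    relative_distance(parity_images(x), parity_images(z))
    for x in inputs for z in inputs if x != z
}

# Distances between image vectors under the coordinate functions x -> x_i,
# for which the image vector of x is x itself.
coordinate_distances = {
    relative_distance(x, z) for x in inputs for z in inputs if x != z
}

print("parity family:", sorted(distances))
print("coordinate functions:", sorted(coordinate_distances))
```

Under the parity family every pair of distinct inputs is separated on exactly half of the functions, whereas under the coordinate functions two inputs can differ on as little as a $1/n$ fraction. This is the kind of structural difference the stronger bound exploits.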

We believe that QFACs constitute a primitive that should be of wide interest in studying the prop- 
erties of quantum states from an information-theoretic point of view. In this paper, we demonstrate 
the relevance of this construct by showing how good bounds on some QFACs can be used to prove the 
security of an extractor against quantum storage with almost-optimal parameters. In fact, many pre- 
vious constructions of extractors against quantum storage can be seen as implicitly proving bounds 
on QFACs. For example, the construction in [KMR05] shows that any $(n, b, \varepsilon)$ QFAC for a set of
2-universal hashing functions must have length $b \ge n - 2\log(1/2\varepsilon)$.

Techniques. In this section we give an overview of our proof technique, explaining the connection 
between extractors and QFACs in the context of Trevisan's general construction paradigm [Tre01].
To describe this, let us first give a brief overview of the main steps that go into the proof of the
construction by Ta-Shma [Ts09].

The construction starts by encoding the weakly random source $x \sim X$ using a locally list-decodable
code $C$ [STV01]. This is followed by an application of the Nisan-Wigderson generator [NW94],
interpreting $C(x)$ as the truth table of the "hard" function.

The proof of correctness for this construction, as the first part of ours, follows the general
reconstruction framework of [Tre01]. For the sake of contradiction, assume that there is a test $T$, which
performs a measurement on the adversary's quantum encoding $\Psi(x)$ in order to distinguish the
output from uniform with advantage $\varepsilon$. A Markov argument shows that for at least an $\varepsilon/2$ fraction of
the samplings $x$ from the source (call them bad samplings), $T$ can distinguish the output (when the
source is $x$) from uniform with success at least $\varepsilon/2$. Consider any such bad sampling $x$. A standard
hybrid argument, along with properties of the Nisan-Wigderson generator, allows us to construct
a circuit $T'$ (using little non-uniformity about $x$) which predicts a random position of $C(x)$ with
probability $\frac{1}{2} + \frac{\varepsilon}{m}$. Further, $T'$ makes exactly one query to $T$.

At this stage, we have constructed a small circuit $T'$, which uses the adversary's quantum information
in order to predict the bits of $C(x)$ with some small success probability. The proof in [Ts09] shows
how from such a circuit, one can construct another circuit which predicts any position of $x$ with
probability 0.99 and queries $T'$ at most $q = (m/\varepsilon)^c$ times ($c = 15$ for the code in [Ts09]). This gives
a random access code for $x$; however, since it makes $q$ measurements on the quantum state $\Psi(x)$, the
no-cloning theorem forces us to see it as having a length of $q \cdot b$ qubits. The main drawback of this
method is that the quantum state needs to be copied a large number of times in order to get a RAC,
thus yielding a weaker bound on the output length than one might hope for.

1 Such codes were introduced in [BARdW08], where they are called XOR-QRACs.
2 A RAC is then simply a QFAC for the set of coordinate functions $f_i : x \mapsto x_i$.

Our proof departs fundamentally from the usual reconstruction paradigm at this point: instead of 
using a short RAC for $C(x)$ to construct a longer RAC for $x$, we give a direct analytical argument
showing that any RAC for $C(x)$ must be long. Note that a RAC for $C(x)$ is simply a QFAC for the
class of functions $f_i : x \mapsto C(x)_i$. Intuitively, such a QFAC cannot be short, even though its success
probability $1/2 + \varepsilon/m$ is small. If the QFAC is classical (CFAC), this is easy to show: assume that
there existed a short CFAC for this problem. One can just repeat the recovery procedure to get a
string $y$ that agrees with $C(x)$ at a fraction $1/2 + \varepsilon/m$ of positions, and then one can use the good
list-decoding properties of $C$ to argue that the CFAC essentially lets us recover the whole input $x$,
and hence must be long. In the quantum setting, however, it is far from obvious if this is true, the 
primary difficulty being that we cannot repeat the recovery procedure, since it involves measuring a 
quantum state. 

In order to overcome this difficulty, we directly prove a lower bound on the length of the QFAC derived 
from the code. This lets us derive a contradiction, proving that our extractor is safe against quantum 
storage. The idea for the lower bound consists in seeing any good QFAC as an adversary which uses 
small memory, and is able to predict codeword positions. Using the fact that list decodable codes can 
be interpreted as one-bit strong extractors, we can then use a result by König and Terhal [KT08]
to show that such an adversary would imply a classical adversary with similar storage against a 
one-bit strong extractor, which we know does not exist. This leads to a contradiction, thus proving 
the lower bound. 

Our results. We show that any extractor based on Trevisan's construction paradigm and its
variants [Tre01, RRV99] is also safe against a bounded quantum storage adversary, with almost the
same parameters as the classical construction. Rather than give the full technical result here (see
Theorem 4.5), we discuss instantiations with two specific codes.

We first use a code from [GHSZ02], which is obtained through the concatenation of the Reed-Solomon 
code and the Hadamard code. This lets us prove the following: 

Theorem 1.1 For any constants $\gamma, c, d > 0$, there is a polynomial-time computable function
$\mathrm{Ext} : \{0,1\}^N \times \{0,1\}^t \to \{0,1\}^m$, where $t = O(\log N)$ and $m = \Theta((K-b)/K^{\gamma})$, which is a $(K, N^{-c})$ extractor
against $b$ qubits of quantum storage, for any $K > N^{c}$.

We note that the construction in [Ts09] uses the concatenation of a Reed-Muller code with the
Hadamard code, the parameters of the Reed-Muller code being chosen so that one can do local
list-decoding. In contrast, our analysis just needs a good list-decoding radius, but no local list-decoding
property. Hence our result carries over to [Ts09] and in particular implies that the construction
in [Ts09] has much better output length than the one shown in that paper, which was $\Omega((K/b)^{1/15})$.

This first construction does not have the desirable property of local computability. By using a different 
code, we can also show the following: 

Theorem 1.2 For any constants $\alpha, c > 0$, there is a function $\mathrm{Ext} : \{0,1\}^N \times \{0,1\}^t \to \{0,1\}^m$,
where $t = O(\log^4 N)$ and $m = \Theta(\alpha N - b)$, which is a $(\alpha N, N^{-c})$ extractor against $b$ qubits of quantum
storage. Moreover, each bit output by the extractor is computable in $\mathrm{poly}\log N$ time.



Even though it has a slightly larger seed length (note however that its output length is the optimal
$\Theta(\alpha N - b)$), a major advantage of this extractor is its simplicity: each bit of the output is simply the
XOR of $O(\log N)$ bits of the source, chosen based on the seed. In particular, it is locally computable.³
On the other hand, that construction is restricted to extracting from linear entropy rates. This is
inevitable, as lower bounds by Viola [Vio04] show that locally computable extractors cannot extract
from sources with entropy less than $N^{.99}$ using a polylogarithmic seed length.

The QFACs at the heart of this second construction are in fact the XOR-QRACs from [BARdW08].
A by-product of our proof is an improvement of the lower bound proved in that paper on the length
of such codes (see Corollary 3.10).

A nice side feature of both these constructions, especially if one is interested in cryptographic appli- 
cations, is that it is possible to achieve an arbitrary inverse polynomial statistical distance from the 
uniform distribution, while paying only a polylogarithmic cost in terms of output length and seed 
length (this will be apparent from the more detailed statement of Theorem 1.1 given in Section 4).
This property was not known to hold for previous short seed extractor constructions against quantum 
storage. 



Applications to cryptography. Our results are of direct applicability to the following key
expansion scenario. Alice and Bob share a small secret uniformly random key $k$. They would like to
expand it into a longer key $k'$ in order to securely communicate in the presence of an adversary Eve. A
public source of weak randomness $R$ (assume that $R$ has min-entropy at least $K$) is available to all
parties. When the string $R$ is broadcast, Eve is allowed to compute an arbitrary function $\Psi$ which
maps the input to a state on $b$ qubits, i.e., $\Psi : \{0,1\}^{|R|} \to \mathbb{C}^{2^b \times 2^b}$, and store the result. However,
once she stores $\Psi(R)$, her access to $R$ is cut off. The goal is to come up with an efficient function
Ext which can be used by Alice and Bob to compute the shared string $k' = \mathrm{Ext}(R, k)$. The required
security condition is that $k'$ is close to being uniformly random to Eve, even given her knowledge of
$\Psi(R)$. In fact, we would like $k'$ to remain random even if $k$ is later revealed to Eve (after $\Psi(R)$ is
computed and access to $R$ has been cut off).

For this application, it is important that Ext be locally computable, i.e. individual bits of the output 
should be a function of a polylogarithmic number of bits of the source R. Indeed, since we are 
putting a cap on the adversary's storage it would be unreasonable not to put a similar cap on the 
memory used by the honest parties Alice and Bob to compute bits of their shared key. 

Our second construction has the property of being locally computable: every bit of the output is a 
function of polylogarithmically many bits from the source. While various constructions of classical 
locally computable extractors are already known [DM04, Lu04, Vad04, DT09], ours are the first to
be proved secure against quantum adversaries. This makes them particularly suitable for use in the 
context of bounded storage cryptography. 

We note here that the results in this paper have recently been extended by Portmann, Renner, 
and the authors to show that Trevisan's extractor is secure in a broader context than that of the 
bounded-storage model [DPRV09]: they show security when one has a lower bound on the conditional 
min-entropy of the source, conditioned on the adversary's quantum information. This is a more 

3 For this to hold, we also need to check that the bits to be XOR-ed can be chosen in poly-logarithmic time, which 
is the case in this construction. 
general assumption since it is implied by the bounded storage assumption, but the converse is not
true in general. Proving security in this setting is crucial in a cryptographic context, as it allows 
secure composability of the extractor with other cryptographic primitives. 

Organization of the paper. We start with some preliminaries in Section 2. In Section 3 we
introduce quantum functional access codes and give bounds for some specific families of these codes.
In Section 4 we describe our construction and state its parameters. Finally, the proof of security is
given in Section 5.

2 Notation and Preliminaries 

The following notations are used throughout the paper. For $x \in \{0,1\}^n$, $x_i$ denotes the $i$-th bit of
$x$. Given two $n$-bit strings $x, y$, we let $\Delta(x,y)$ denote their relative Hamming distance, i.e. the
fraction of positions at which they differ. $\mathcal{D}_b$ denotes the set of all density matrices on $b$ qubits
(complex $2^b$-dimensional positive matrices with trace 1). In general, a measurement $M$ is described
by a list of positive operators $M_a$ such that $\sum_a M_a^{\dagger} M_a = \mathrm{Id}$. The probability that outcome $a$ is
observed when the measurement is performed on a density matrix $\rho$ is then given by $\mathrm{Tr}(M_a \rho M_a^{\dagger})$.
All logarithms are taken in base 2. Throughout, $H$ will denote the binary entropy function
$H(x) = -x \log x - (1-x)\log(1-x)$ for $0 < x < 1$. We set the convention that $H(0) = H(1) = 0$.

Distributions. The uniform distribution on $\{0,1\}^n$ is denoted by $U_n$. We will manipulate random
variables that have both classical and quantum parts. In general, given two classical random variables
$X, Y$, $X \circ Y$ is the same as the random variable $(X, Y)$. Given two states $\rho, \sigma$, $\rho \circ \sigma$ is just $\rho \otimes \sigma$. Finally,
given a classical random variable $X : \Omega \to \{0,1\}^n$ and a quantum random variable $\rho : \Omega \to \mathcal{D}_b$, $X \circ \rho$
denotes the state $\mathbb{E}_{\omega \in \Omega}\left[\, |X(\omega)\rangle\langle X(\omega)| \otimes \rho(\omega) \right]$. The statistical distance between two distributions $D_1$
and $D_2$ (or, more generally, the trace distance when these distributions involve quantum components)
is denoted by $\|D_1 - D_2\|$.

Definition 2.1 A (classical) distribution $X$ is said to have min-entropy at least $K$ (denoted $H_\infty(X) \ge K$)
if $\forall x$, $\Pr[X = x] \le 2^{-K}$.

Extractors. We first give the formal definition of a strong extractor.

Definition 2.2 $\mathrm{Ext} : \{0,1\}^N \times \{0,1\}^t \to \{0,1\}^m$ is said to be a $(K, \varepsilon)$ strong extractor if for every
distribution $X$ with min-entropy at least $K$, we have that $\|U_{m+t} - \mathrm{Ext}(X, U_t) \circ U_t\| \le \varepsilon$. Here both
$U_t$'s in the second expression correspond to the same sampling.

X is usually called the source (and N its length), while the extractor's second input is called the seed 
(of length t). 
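
For very small parameters, Definitions 2.1 and 2.2 can be checked by exhaustive enumeration. The following Python sketch is only an illustration under our own assumptions: it uses the inner product modulo 2 as a toy one-bit extractor with a seed as long as the source (this is not the construction studied in this paper), takes the statistical distance to be half the $\ell_1$ distance, and all function names are hypothetical.

```python
import math
from collections import Counter
from fractions import Fraction
from itertools import product


def min_entropy(dist):
    """H_inf(X) = -log2 max_x Pr[X = x], for a dict {outcome: probability}."""
    return -math.log2(max(dist.values()))


def inner_product_ext(x, y):
    """Toy one-bit extractor (an assumption for illustration): <x, y> mod 2."""
    return sum(a & b for a, b in zip(x, y)) % 2


def distance_from_uniform(source, n):
    """Statistical distance between Ext(X, U_n) o U_n and U_{1+n}.

    `source` maps n-bit tuples to probabilities; the seed is uniform on
    {0,1}^n and the same seed sample is used in both coordinates.
    """
    seeds = list(product((0, 1), repeat=n))
    joint = Counter()
    for x, px in source.items():
        for y in seeds:
            joint[(inner_product_ext(x, y), y)] += Fraction(px) / len(seeds)
    uniform = Fraction(1, 2 * len(seeds))
    total = sum(abs(joint[(bit, y)] - uniform) for bit in (0, 1) for y in seeds)
    return total / 2


if __name__ == "__main__":
    # Flat source on a 2-dimensional subspace of {0,1}^4 (min-entropy 2).
    subspace = {(a, b, 0, 0): Fraction(1, 4) for a in (0, 1) for b in (0, 1)}
    print(min_entropy(subspace), distance_from_uniform(subspace, 4))
```

Enumerating all seeds is only feasible for toy sizes; the point of the sketch is to make the roles of the source, the seed, and the joint distribution $\mathrm{Ext}(X, U_t) \circ U_t$ concrete.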

We now extend this definition to that of a strong extractor secure against a bounded-storage quantum 
adversary. 

Definition 2.3 $\mathrm{Ext} : \{0,1\}^N \times \{0,1\}^t \to \{0,1\}^m$ is said to be a $(K, \varepsilon)$ strong extractor against $b$
qubits of quantum storage if for every map $\Psi : \{0,1\}^N \to \mathcal{D}_b$ and every distribution $X$ such that
$H_\infty(X) \ge K$,

$$\| U_m \circ \Psi(X) \circ U_t - \mathrm{Ext}(X, U_t) \circ \Psi(X) \circ U_t \| \le \varepsilon \qquad (1)$$

where both $U_t$'s in the second expression correspond to the same sampling.



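We note that condition (1) above is equivalent to requiring that for any collection of measurements
$\{M^0_{w,y}, M^1_{w,y}\}$ on $\mathcal{D}_b$, indexed by $w \in \{0,1\}^m$ and $y \in \{0,1\}^t$,

$$\left| \mathbb{E}_{x \sim X,\, y \sim U_t} \left[ \mathrm{Tr}\left( M^1_{\mathrm{Ext}(x,y),y}\, \Psi(x)\, (M^1_{\mathrm{Ext}(x,y),y})^{\dagger} \right) - \mathbb{E}_{w \in \{0,1\}^m}\, \mathrm{Tr}\left( M^1_{w,y}\, \Psi(x)\, (M^1_{w,y})^{\dagger} \right) \right] \right| \le \varepsilon$$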



Quantum codes. A $(n, b)$ quantum encoding is a map $\Psi : \{0,1\}^n \to \mathcal{D}_b$. A fundamental theorem
due to Holevo states that, for any fixed measurement $M$, the outcome of that measurement when
performed on $\Psi(x)$ cannot contain more information about $x$ than a classical string of $b$ bits:

Theorem 2.4 [Hol73] Let $X$ be any distribution on $\{0,1\}^n$ and $\Psi(X) = \mathbb{E}_{x \sim X}[\Psi(x)]$. For a
particular measurement $M$, let $Y_M$ denote the classical random variable resulting from applying the
measurement on $\Psi(X)$. If $I(X : Y_M)$ denotes the mutual information of $X$ and $Y_M$ and $S(\Psi(X))$
denotes the von Neumann entropy of $\Psi(X)$, then $I(X : Y_M) \le S(\Psi(X))$.



Oracle circuits. Our proofs of security will involve the construction of oracle circuits. If $A$ is an
oracle circuit, we denote by $A^B$ the circuit that uses $B$ as the oracle. Further, let $C$ be an oracle
machine which uses $A$ as an oracle (denoted by $C^A$). Then it is understood that when $C$ calls $A$, then
$A$ calls the appropriate oracle $B$. Thus $C^A = C^{A^B}$. We will say that a circuit $C : \{0,1\}^{n+t} \to \{0,1\}$
computes a function $f$ with $t$ bits of advice if there is a string $a \in \{0,1\}^t$ such that for every
$x \in \{0,1\}^n$, $C(x,a) = f(x)$.

We will use the following easy claim: 



Claim 2.5 Let $B$ be any oracle such that oracle circuit $A$ can be constructed using at most $t_1$ bits
of advice and $A$ queries $B$ at most $q_1$ times. Let $C$ be an oracle circuit which queries $A$ at most
$q_2$ times and can be constructed using at most $t_2$ bits of advice. Then
$C$ can be considered as an oracle circuit which queries $B$ at most $q_1 q_2$ times and can be constructed
using at most $t_1 + t_2$ bits of advice.



3 Quantum Functional Access Codes 

Consider the following problem from the theory of error-correcting codes. Let $C : \{0,1\}^n \to \{0,1\}^m$
be a code which is $(\varepsilon, L)$ list-decodable, i.e. for any $x \in \{0,1\}^m$, there are at most $L$ codewords $y$
such that $\Delta(x,y) < \frac{1}{2} - \varepsilon$. Let $A = \{C(x) : x \in \{0,1\}^n\}$ be the set of all codewords, and consider
$\mathrm{Enc} : A \to \{0,1\}^b$, a probabilistic encoding such that for every $z \in A$, $z_i$ can be recovered from
$\mathrm{Enc}(z)$ with probability $\frac{1}{2} + 2\varepsilon$, on average over the choice of $i \in [m]$. Given $\mathrm{Enc}(z)$, by performing
the recovery procedure for every index $i$, we obtain a string $y$ which will agree with $z$ on at least a
$\frac{1}{2} + \varepsilon$ fraction of the positions with high probability. But then the exact element $z$ can be recovered
using just an additional $\log L$ bits of advice (as per the list-decodability property of $C$). Hence, Enc
can be seen as a high-probability encoding of any codeword, using only $b + \log L$ bits. However,
the obvious information-theoretic bound shows that this must be at least $\log |C|$ bits, implying that
$b \ge \log(|C|/L)$. This is much better than the bound $b \ge \Omega(\varepsilon^2 \log |C| / \log n)$, for small $\varepsilon$, that one gets if
there is no guarantee on the structure of the set $C$ (see Theorem 3.2 in [Ts09] for a proof).

To model this situation more precisely, note that the recovery procedure lets us recover any bit of $C(x)$
with non-trivial probability. As such, Enc can be seen as a probabilistic encoding of every $x \in \{0,1\}^n$
which lets us evaluate a class of functions $\mathcal{C} = \{g_i : x \mapsto C(x)_i,\ i \in [m]\}$. This is a generalization of
the usual random access codes, introduced in [ANTsV02], for which $\mathcal{C} = \mathcal{C}_1 = \{g_i : x \mapsto x_i,\ i \in [n]\}$.

It is natural to expect that lower bounds for this more demanding kind of random access code would 
be tighter than more general lower bounds, in a way that depends on the structure of C. We introduce 
the following definition: 

Definition 3.1 Let $A \subseteq \{0,1\}^n$, and $\mathcal{C} \subseteq \{f : A \to \{0,1\}\}$ a set of functions defined on $A$. For
$\varepsilon \in (0, 1/2]$, a $(n,b,\varepsilon)$ quantum functional access code, or QFAC, for $(A,\mathcal{C})$ is a map $\Psi : A \to \mathcal{D}_b$
such that, for every $f \in \mathcal{C}$, there is a measurement $M_f = \{M_f^0, M_f^1\}$ such that for every $x \in A$,
$\mathrm{Tr}(M_f^{f(x)}\, \Psi(x)\, (M_f^{f(x)})^{\dagger}) \ge 1/2 + \varepsilon$. If this first property only holds on average over the choice of $f$,
then we will say that $\Psi$ is a $(n,b,\varepsilon)$ QFAC on average for $(A, \mathcal{C})$.

The discussion above shows a strong lower bound on the length of any classical functional access 
code for a set of functions $\mathcal{C} = \{g_i : x \mapsto C(x)_i,\ i \in [m]\}$ that is derived from a good list-decodable
code C. However, the classical argument cannot be extended in a straightforward manner to the 
quantum case, as it is dependent upon performing successive measurements on the encoding. If the 
encoding is quantum, the first such measurement will destroy the state, and we will not be able to 
proceed further. 

Nevertheless, for some specific cases, such bounds follow from previously known results. We start 
with the standard setting of random access codes, for which Theorem 4.1 in [ANTsV02] implies the 
following (see also Theorem 3.2 in [Ts09]): 

Lemma 3.2 Let $A \subseteq \{0,1\}^n$ and $\varepsilon \in (0,1/2]$ such that there exists a $(n,b,\varepsilon)$ quantum functional
access code for $(A,\mathcal{C}_1)$. Then $\log |A| \le O\left(\frac{b \log n}{\varepsilon^2}\right)$.

Central to this work is the fact that functional access codes for larger classes of functions than the 
simple coordinate functions C\ enjoy much stronger lower bounds, with a weaker dependence on the 
success probability $\varepsilon$. König, Maurer and Renner [KMR05] show the following:

Theorem 3.3 ([KMR05], Thm. 12 and Cor. 13) Let $\mathcal{C}$ be the set of all functions from $\{0,1\}^n$
to $\{0,1\}$. Then any $(n, b, \varepsilon)$ QFAC on average for $(A, \mathcal{C})$ satisfies $\log |A| \le b + 2\log(1/2\varepsilon)$. Moreover,
the same bound holds if $\mathcal{C}$ is any family of two-universal hash functions, and the decoding procedure
is only required to be correct on average over the choice of $x \in A$.

There is an obvious connection between lower bounds on the length of QFACs and lower bounds on 
one-way quantum communication complexity, even though results in the latter setting usually do not 
focus on the error dependence as much as is needed for our applications. Nevertheless, the following 
bound easily follows from known results: 

4 As noted in [Ts09], the loss of a factor $\log n$ is inevitable. Note however that this can be removed in the case where
$A = \{0,1\}^n$ by following the proof for quantum random access codes in [ANTsV02].
Lemma 3.4 Let $\mathcal{C} = \{g_y : x \mapsto x \cdot y \bmod 2,\ y \in \{0,1\}^n\}$. If there exists a $(n,b,\varepsilon)$ QFAC for $(A,\mathcal{C})$,
then $\log|A| \le b + 2\log(1/2\varepsilon)$.

Proof: Note that any $(n, b, \varepsilon)$ QFAC for $(A, \mathcal{C})$ implies a one-way quantum protocol for the
communication problem in which Alice is given $x \in A$, Bob is given $y \in \{0,1\}^n$, and their goal
is to output $x \cdot y \bmod 2$. Using a reduction from [CDNT98], any such protocol communicating $b$
qubits and succeeding with probability $1/2 + \varepsilon$ can be transformed into a protocol that sends any
$x \in A$ to Bob, using $b$ qubits, with success probability $4\varepsilon^2$. Theorem 1.1 in [NS06] then shows that
$b \ge \log|A| - \log(1/4\varepsilon^2)$, which is the claimed bound since $\log(1/4\varepsilon^2) = 2\log(1/2\varepsilon)$. ■

Families of two-universal hash functions over {0, l} n , as well as the Hadamard code, both have size 
$\Omega(2^n)$, which makes them unsuitable for our purposes. Indeed, in our applications to extractors we
will use the seed to select a few random functions from the family C and apply them to the source 
in order to obtain the output. However, using any of the last two function families would require a 
seed of length linear in the source length, whereas we would like it to be poly-logarithmic. 

Our main result relies on the fact, proved below, that there are no short QFACs for families of 
functions that are defined from list-decodable codes. This extends the discussion introducing this 
section to the case of quantum encodings, and in fact we will get essentially the same bound as stated 
there — even though, as we argued was necessary, the proof will be very different. It will be useful 
to consider approximately list-decodable codes, which we define as follows: 

Definition 3.5 Let $\varepsilon, \delta > 0$ and $L \in \mathbb{N}$. A code $C : \{0,1\}^N \to \{0,1\}^{\bar{N}}$ is $(\varepsilon, \delta, L)$ approximately
list-decodable if for every $x \in \{0,1\}^{\bar{N}}$, there exist at most $L$ strings $\{y_i\}_{i=1}^{L} \subseteq \{0,1\}^N$ such that for
any string $z \in \{0,1\}^N$ satisfying $\Delta(x, C(z)) < 1/2 - \varepsilon$, $\exists i \in \{1, \ldots, L\}$ such that $\Delta(z, y_i) \le \delta$. If $C$
is $(\varepsilon, 0, L)$ approximately list-decodable then we simply say that $C$ is $(\varepsilon, L)$ list-decodable.

Proposition 3.6 Let $\varepsilon, \delta > 0$ and $L \in \mathbb{N}$. Let $C : \{0,1\}^N \to \{0,1\}^{\bar{N}}$ be a $(\varepsilon/2, \delta, L)$ approximately
list-decodable code, and $\mathcal{C} = \{f_i : x \mapsto C(x)_i,\ i \in [\bar{N}]\}$. Let $A \subseteq \{0,1\}^N$, and suppose that there
exists a $(N,b,\varepsilon)$ QFAC on average for $(A,\mathcal{C})$. Then

$$\log |A| \le H(\delta)N + b + \log L + O(\log 1/\varepsilon).$$

Moreover, this bound holds even when we only require the QFAC to have success probability $1/2 + \varepsilon$
on average over the choice of $x \in A$, instead of for all $x$.

The proof crucially relies on the result by König and Terhal [KT08] that strong one-bit extractors are
automatically safe against quantum adversaries, in some range of parameters. It proceeds through 
the following three steps: 

1. Show that any $(\varepsilon, \delta, L)$ approximately list-decodable code $C$ defines a good 1-bit classical strong
extractor.

2. Use Theorem III.1 from [KT08] to show that the previous extractor is automatically safe against
quantum adversaries that are allowed a bounded amount of storage.

3. Conclude by showing how the security against quantum storage implies a lower bound on any
QFAC on average for $(A,\mathcal{C})$.

We proceed with the details. 
Proof: Let $t = \log \bar{N}$ (assume it is an integer for simplicity) and consider the following 1-bit extractor

$$E : \{0,1\}^N \times \{0,1\}^t \to \{0,1\}, \qquad (x,y) \mapsto C(x)_y.$$

The following claim proves item 1 above. 

Claim 3.7 $E : \{0,1\}^N \times \{0,1\}^t \to \{0,1\}$ as defined above is a $(K, \varepsilon)$ strong extractor for any
$K > H(\delta)N + \log L + \log\frac{2}{\varepsilon}$.

Proof: Assume for the sake of contradiction that $E$ is not a $(K, \varepsilon)$ strong extractor. Then there
is a distribution $D$ with min-entropy $K$, and a statistical test $T$ such that the following holds:

$$\left| \Pr_{x \sim D,\, y \sim U_t}[T(y) = C(x)_y] - \frac{1}{2} \right| > \varepsilon.$$

With a possible flip in the output of circuit $T$, we get a new test $T'$ such that

$$\Pr_{x \sim D,\, y \sim U_t}[T'(y) = C(x)_y] > \frac{1}{2} + \varepsilon.$$

By a Markov argument, there is a set $\mathrm{BAD} \subseteq \{0,1\}^N$ such that for every $x \in \mathrm{BAD}$,

$$\Pr_{y \sim U_t}[T'(y) = C(x)_y] > \frac{1}{2} + \frac{\varepsilon}{2}$$

and $\Pr_{x \sim D}[x \in \mathrm{BAD}] > \varepsilon/2$. Evaluating $T'$ on every possible $y \in \{0,1\}^t$ results in a string $x'$ such
that

$$\Pr_{y \sim U_t}[x'_y = C(x)_y] > \frac{1}{2} + \frac{\varepsilon}{2} \qquad (2)$$

We can now use the $(\varepsilon/2, \delta, L)$ list-decodability properties of $C$. For any $x'$ satisfying (2) we can get
a set of at most $L$ strings $x^1, \ldots, x^k$ such that at least one of them satisfies

$$\Pr_{y}[x^i_y = x_y] \ge 1 - \delta \qquad (3)$$

Note that the process of finding $x^1, \ldots, x^k$ need not be polynomial time, but we only require existence
here; the important point is that the list of $x^i$ is uniquely determined by $x'$ (take the lexicographically
smallest list satisfying the conditions of Definition 3.5). If $x^1, \ldots, x^k$ are known, then we require at most
$\log L$ bits to specify $i \in [k]$ such that $x^i$ satisfies (3). Once $x^i$ is specified, we know that $x$ must be
among one of the at most $2^{H(\delta)N}$ possible $N$-bit strings which are $\delta$-close to $x^i$. Hence we require
an additional $H(\delta)N$ bits to fully specify $x$. Thus, the total amount of bits used to specify $x$ is
$\log L + H(\delta)N$, which in turn implies that the size of the set BAD is bounded by $L \cdot 2^{H(\delta)N}$.

To conclude the argument, observe that every element in BAD is sampled with probability at most 
2~ K and hence FxeD[X € BAD] < (L ■ 2- R+H ^ N ). However, this is a contradiction if 

L-2- K+H ® N <- i.e. K > H(5)N + log L + log - 
2 s 

which gives the bound stated in the claim. H 
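
As a rough numerical illustration of Claim 3.7, the sketch below evaluates the threshold $H(\delta)N + \log L + \log(2/\varepsilon)$. The function names and the sample parameters are our own illustrative choices and do not come from the paper.

```python
import math

def binary_entropy(delta: float) -> float:
    """Binary entropy H(delta) in bits, with H(0) = H(1) = 0."""
    if delta <= 0.0 or delta >= 1.0:
        return 0.0
    return -delta * math.log2(delta) - (1 - delta) * math.log2(1 - delta)

def min_entropy_threshold(n: int, delta: float, list_size: float, eps: float) -> float:
    """Right-hand side of Claim 3.7: H(delta) * N + log L + log(2 / eps)."""
    return binary_entropy(delta) * n + math.log2(list_size) + math.log2(2 / eps)

# Hypothetical parameters: exact list decoding (delta = 0) with L = 4 / eps^2.
eps = 2 ** -10
threshold = min_entropy_threshold(n=1 << 20, delta=0.0, list_size=4 / eps**2, eps=eps)
```

With exact list decoding the $H(\delta)N$ term vanishes, so in this example the threshold depends only on $\log L$ and $\log(2/\varepsilon)$ and not on the source length.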
Let $\eta > 0$ be an error parameter, $A \subseteq \{0,1\}^N$, and $U_A$ the uniform distribution on $A$. Theorem III.1
in [KT08] implies that, as long as

$$\log|A| - b > K + \log 1/\eta, \qquad (4)$$

the function $E$ is automatically a $(\log|A|, 3\sqrt{\eta})$ extractor that is secure against $b$ qubits of quantum
storage (see Definition 2.3). This means that, for any collection of quantum states $\Psi(x) \in \mathcal{D}_b$,
knowledge of $y$ and $\Psi(x)$ cannot help distinguish $E(x, y)$ from a uniformly random bit with advantage
more than $3\sqrt{\eta}$ (over the choice of $x$ in $A$, and uniform $y$). In particular, we have that for any
collection of measurements $\{M^0_y, M^1_y\}_{y \in \{0,1\}^t}$ on $\mathcal{D}_b$,

$$\mathbf{E}_{x \in A,\, y \in \{0,1\}^t}\, \mathrm{Tr}\bigl(M_y^{C(x)_y}\, \Psi(x)\, (M_y^{C(x)_y})^\dagger\bigr) \le \tfrac{1}{2} + \tfrac{3\sqrt{\eta}}{2}$$

By definition, any $(N, b, \varepsilon)$ QFAC on average for $(A, \mathcal{C})$, even one that is only correct on average over
the choice of $x$, contradicts this conclusion for $\eta = 4\varepsilon^2/9$. Hence our assumption (4) on the size of
$A$ must be contradicted, i.e. any such QFAC must be such that $\log|A| \le K + b + \log(9/4\varepsilon^2)$. Setting
$K$ to be the smallest possible value satisfying the condition in Claim 3.7 we get

$$\log|A| \le H(\delta)N + b + \log L + O(\log 1/\varepsilon)$$
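
The choice of $\eta$ can be checked directly. The short calculation below only restates the substitution used in the argument, assuming the error term in the displayed measurement bound is $3\sqrt{\eta}/2$.

```latex
\[
\eta = \frac{4\varepsilon^2}{9}
\quad\Longrightarrow\quad
\frac{3\sqrt{\eta}}{2} = \frac{3}{2}\cdot\frac{2\varepsilon}{3} = \varepsilon,
\qquad
\log\frac{1}{\eta} = \log\frac{9}{4\varepsilon^2} = O\!\left(\log\frac{1}{\varepsilon}\right).
\]
```

So the guessing probability allowed by the extractor property is $1/2 + \varepsilon$, which is exactly the success probability that an $(N, b, \varepsilon)$ QFAC is required to reach.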



We describe two instantiations of this proposition, for specific families of codes. The first one, which 
will let us get an extractor with optimal seed length, is based on the following from [GHSZ02]: 

Fact 3.8 For any $N \in \mathbb{N}$, $\varepsilon > 0$, there exists a polynomial-time computable code $C_r : \{0,1\}^N \to
\{0,1\}^{\bar{N}}$, where $\bar{N} = O(N/\varepsilon^4)$, that is $(\varepsilon, O(1/\varepsilon^2))$ list-decodable.

These codes lead to the following, the proof of which follows immediately from Proposition 3.6.

Corollary 3.9 Let $C_r$ be the code from Fact 3.8 and $\mathcal{C}_r = \{f_i : x \mapsto C_r(x)_i,\ i \in [\bar{N}]\}$. Then any
$(N, b, \varepsilon)$ QFAC on average for $(A, \mathcal{C}_r)$ is such that

$$\log|A| \le b + O(\log 1/\varepsilon)$$

Moreover, this bound holds even when we only require the QFAC to have success probability $1/2 + \varepsilon$
on average over the choice of $x \in A$.

Our second main construction uses a QFAC for the class $\mathcal{C}_k = \{g : x \mapsto \bigoplus_{j=1}^{k} x_{i_j},\ (i_1, \ldots, i_k) \in \binom{[N]}{k}\}$.
QFACs for this class of functions were introduced in [BARdW08], where they are called XOR-QRACs. 
That paper shows a bound on the length of such codes using a generalization of the hypercontractive 
inequality to matrix-valued functions. We improve their result by showing the following: 

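Corollary 3.10 Let $k, N$ be integers, and $\varepsilon > 2k^2/2^N$. Let $A \subseteq \{0,1\}^N$. If there exists a $(N, b, \varepsilon)$
QFAC on average for $(A, \mathcal{C}_k)$, then

$$\log|A| \le b + H\!\left(\tfrac{1}{k}\ln\tfrac{2}{\varepsilon}\right) N + O\!\left(\log\tfrac{1}{\varepsilon}\right)$$

Moreover, this bound holds even when we only require the QFAC to have success probability $1/2 + \varepsilon$
on average over the choice of $x \in A$.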
By generalizing the proof of Theorem 7 in [BARdW08] (which is only stated for $A = \{0,1\}^N$ in
that paper), we can get the bound $\log|A| \le b + \bigl(1 - \tfrac{1}{2\ln 2}\,\varepsilon^{2/k} + o_N(1)\bigr) N$ for all $k > \log\log N$.
This would lead to an extractor construction which only works for sources with min-entropy $\gamma N$ for
$\gamma > 0.28$, and our improvement on their bound gets rid of this constraint.

Proof: The following lemma (for a reference, see [IJK06], Lemma 42) shows that for any $\varepsilon >
2k^2/2^N$, the XOR code is $(\varepsilon, (1/k)\ln(2/\varepsilon), 4/\varepsilon^2)$ approximately list-decodable.

Lemma 3.11 For every $\varepsilon > 2k^2/2^N$ and $z' \in \{0,1\}^{\binom{N}{k}}$, there is a list of $t \le 4/\varepsilon^2$ elements
$x^1, \ldots, x^t \in \{0,1\}^N$ such that the following holds: for every $z \in \{0,1\}^N$ which satisfies

$$\Pr_{(y_1, \ldots, y_k)}\Bigl[z'_{(y_1, \ldots, y_k)} = \bigoplus_{j=1}^{k} z_{y_j}\Bigr] \ge \tfrac{1}{2} + \varepsilon$$

there is an $i \in [t]$ such that

$$\Pr_{y \sim U_N}[x^i_y = z_y] > 1 - \delta$$

with $\delta = (1/k)\ln(2/\varepsilon)$.

Note that in [IJK06] the lemma is proved for tuples instead of sets, and has $t \le 1/\varepsilon^2$. However, since
most tuples are sets, it is straightforward to get the above version for sets. Plugging the list-decoding
parameters from this lemma in the bound of Proposition 3.6 immediately gives the result. ■
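
To get a feel for the entropy term in Corollary 3.10, the following sketch evaluates $H(\delta)$ with $\delta = (1/k)\ln(2/\varepsilon)$ for a few values of $k$. The helper names, the sample value of $\varepsilon$, and the fallback to rate 1 when $\delta \ge 1/2$ are our own conventions for illustration.

```python
import math

def binary_entropy(p: float) -> float:
    """Binary entropy H(p) in bits, with H(0) = H(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def xor_overhead_rate(k: int, eps: float) -> float:
    """Per-bit overhead H(delta) with delta = (1/k) * ln(2/eps).

    For delta >= 1/2 the entropy bound on Hamming balls no longer applies,
    so we fall back to the trivial rate 1 (our convention, not the paper's).
    """
    delta = math.log(2 / eps) / k
    return 1.0 if delta >= 0.5 else binary_entropy(delta)

# Hypothetical sample: eps = 0.01 and increasing XOR arity k.
rates = {k: xor_overhead_rate(k, eps=0.01) for k in (10, 100, 1000)}
```

The overhead rate shrinks as $k$ grows, which is the reason a larger XOR arity lets the bound approach $\log|A| \le b + o(N)$.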
4 Overview of the construction

Our construction follows the general paradigm introduced by Trevisan [Tre01] and its subsequent
adaptation against quantum storage by Ta-Shma [Ts09]. However, our proof technique differs from
that of [Ts09] in that it avoids constructing random access codes by copying the adversary's storage
many times. Rather, we use the much stronger bounds on QFACs proved in Section 3. This is
crucial in allowing us to prove an additive, rather than multiplicative, dependence of the output on
the adversary's storage.

We first describe a few standard tools that are used in the construction, before giving it in detail.
Its correctness will be proved in Section 5.

4.1 Preliminaries 

Definition 4.1 A collection of subsets $S_1, \ldots, S_m \subseteq [t]$ is called a $(t, n, m, \rho)$ weak design if for all
$i$, $|S_i| = n$ and for all $j$, $\sum_{i<j} 2^{|S_i \cap S_j|} \le \rho(m - 1)$.
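
The weak design condition is easy to check mechanically. The sketch below is a direct transcription of Definition 4.1, assuming the ground set $[t]$ is represented as `{0, ..., t-1}`; the function name and the toy examples are ours.

```python
from typing import List, Set

def is_weak_design(sets: List[Set[int]], t: int, n: int, rho: float) -> bool:
    """Check Definition 4.1 for S_1..S_m inside [t] = {0, ..., t-1}.

    Every set must have size n, and for each j the sum over i < j of
    2^{|S_i & S_j|} must be at most rho * (m - 1).
    """
    m = len(sets)
    for s in sets:
        if len(s) != n or not all(0 <= e < t for e in s):
            return False
    for j in range(m):
        overlap = sum(2 ** len(sets[i] & sets[j]) for i in range(j))
        if overlap > rho * (m - 1):
            return False
    return True

# Toy examples: pairwise-disjoint sets versus three copies of the same set.
disjoint = [{0, 1}, {2, 3}, {4, 5}]
identical = [{0, 1}, {0, 1}, {0, 1}]
```

Disjoint sets meet the condition with $\rho = 1$, while identical sets need $\rho$ as large as $2^n$, which shows how $\rho$ measures the overlap the design tolerates.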

The following theorem is due to Raz, Reingold and Vadhan [RRV99].

Theorem 4.2 For every $m, t \in \mathbb{N}$ and $\rho > 2$, there is a $(t, n, m, \rho)$ design which is computable in
time $(mt)^{O(1)}$ with $t = O(n^2/\log\rho)$.

Note that the value of $t$ blows up when $\rho$ approaches 1. In order to keep $t$ bounded even as
$\rho$ approaches 1, we can use a construction given in [RRV99]. Even though the construction is
computable in polynomial time, it does not meet many finer notions of efficiency which are of interest 
to us. Hartman and Raz [HR03] achieved similar parameters with a better efficiency:

Theorem 4.3 For every $m, t \in \mathbb{N}$ such that $m > n^{\log n}$ and $0 < \gamma < 1/2$, there is a $(t, n, m, 1 + \gamma)$
design such that $t = O(n^2 \log\tfrac{1}{\gamma})$. Further, each individual set in the design can be output in time
polynomial in $t$ and $n$.

For the purposes of this paper, let $t(n, \rho)$ denote the smallest value of $t$ for which Theorems 4.2 or 4.3
guarantee the existence of a weak $(t, n, m, \rho)$ design. Whether we use Theorem 4.2 or 4.3 depends
on how small we want $\rho$ to be.

Our last tool is the Nisan-Wigderson generator with respect to a function $f : \{0,1\}^n \to \{0,1\}$.

Definition 4.4 Let $S_1, \ldots, S_m$ be a $(t, n, m, \rho)$ weak design. Let $x \in \{0,1\}^{2^n}$. Then $\mathrm{NW}_x :
\{0,1\}^t \to \{0,1\}^m$ is defined as

$$\mathrm{NW}_x(y) = x_{y_{S_1}}, \ldots, x_{y_{S_m}}$$

Here $y_{S_j}$ denotes the restriction of $y$ to the indices in $S_j$.
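
Definition 4.4 can be sketched in a few lines. The code below assumes 0-indexed seed positions and reads the restricted seed bits in increasing position order, most significant bit first; both conventions, as well as the function names, are our own choices for illustration.

```python
from typing import List, Sequence

def restrict(y: Sequence[int], s: Sequence[int]) -> int:
    """Read the bits of y at the positions in s (in sorted order) as an integer."""
    idx = 0
    for pos in sorted(s):
        idx = (idx << 1) | y[pos]
    return idx

def nw_generator(x: Sequence[int], y: Sequence[int], sets: List[Sequence[int]]) -> List[int]:
    """NW_x(y): output bit i is x indexed by the seed y restricted to S_i."""
    return [x[restrict(y, s)] for s in sets]

# Toy instance with n = 2 (so x has 2^n = 4 bits), t = 4 and m = 2.
example = nw_generator(x=[0, 1, 1, 0], y=[0, 1, 1, 1], sets=[[0, 1], [2, 3]])
```

Each output bit looks at only $n$ of the $t$ seed bits, so two output bits share seed bits exactly on the intersection of their sets.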

4.2 Description of the construction 

Let $C : \{0,1\}^N \to \{0,1\}^{\bar{N}}$ be a code with good (possibly approximate) list-decoding capabilities,
and $(S_1, \ldots, S_m)$ be a $(t, \log\bar{N}, m, \rho)$ design as discussed above. Then the extractor is obtained by
combining these two constructs in the following way:

$$\mathrm{Ext}_C : \{0,1\}^N \times \{0,1\}^t \to \{0,1\}^m, \qquad (x, y) \mapsto \mathrm{NW}_{C(x)}(y)$$
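
The composition can be sketched as follows. As a stand-in for $C$ we use the Hadamard code $C(x)_a = \langle x, a \rangle \bmod 2$, which is an assumption made purely for illustration and is not one of the codes used in the paper; the function names are likewise ours.

```python
from typing import Callable, List, Sequence

def hadamard_bit(x: Sequence[int], position: Sequence[int]) -> int:
    """Toy code (an assumption, not the paper's): C(x)_a = <x, a> mod 2."""
    return sum(xi & ai for xi, ai in zip(x, position)) % 2

def trevisan_ext(
    x: Sequence[int],
    y: Sequence[int],
    sets: List[Sequence[int]],
    code_bit: Callable[[Sequence[int], Sequence[int]], int] = hadamard_bit,
) -> List[int]:
    """Ext_C(x, y) = NW_{C(x)}(y): output bit i is C(x) at position y restricted to S_i."""
    out = []
    for s in sets:
        position = [y[p] for p in sorted(s)]  # y restricted to S_i
        out.append(code_bit(x, position))
    return out

# Toy instance: N = 3, codeword positions have log(N-bar) = 3 bits, t = 5, m = 2.
example = trevisan_ext(x=[1, 0, 1], y=[1, 1, 1, 0, 0], sets=[[0, 1, 2], [2, 3, 4]])
```

The codeword $C(x)$ is never written out in full: only the $m$ positions selected by the design are evaluated.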

4.3 Main theorem 

Our main result is the following: 

Theorem 4.5 Let $\delta, \varepsilon > 0$. Let $C : \{0,1\}^N \to \{0,1\}^{\bar{N}}$ be a $(\varepsilon/m, \delta, L)$ approximately list-decodable
code, and $t = t(\log\bar{N}, \rho)$ such that there exists a $(t, \log\bar{N}, m, \rho)$ design for all large enough $m$. Then
for any $K, b \in \mathbb{N}$ the function $\mathrm{Ext}_C : \{0,1\}^N \times \{0,1\}^t \to \{0,1\}^m$ is a $(K, 2\varepsilon)$ extractor secure against
$b$ qubits of quantum storage, where

$$m = \frac{K - b - t - H(\delta)N - \log L - \Omega(\log(1/\varepsilon) + \log N)}{1 + \rho}$$
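
The output length can be evaluated numerically as in the sketch below. The constant hidden in the $\Omega(\cdot)$ term is not specified, so `slack_const` is a hypothetical placeholder; the sketch also treats $L$ as given, although for an $(\varepsilon/m, \delta, L)$ code it may itself depend on $m$.

```python
import math

def binary_entropy(p: float) -> float:
    """Binary entropy H(p) in bits, with H(0) = H(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def output_length(K: float, b: float, t: float, N: int, delta: float,
                  L: float, rho: float, eps: float, slack_const: float = 1.0) -> int:
    """Output length m from Theorem 4.5.

    The Omega(.) term is replaced by slack_const * (log(1/eps) + log N),
    where slack_const is a hypothetical constant chosen for illustration.
    """
    slack = slack_const * (math.log2(1 / eps) + math.log2(N))
    numerator = K - b - t - binary_entropy(delta) * N - math.log2(L) - slack
    return max(0, math.floor(numerator / (1 + rho)))

# Hypothetical parameters; increasing b by 10 costs 10 / (1 + rho) output bits.
m_small_storage = output_length(K=1000, b=100, t=50, N=1024, delta=0.0, L=4, rho=1.0, eps=2**-10)
m_large_storage = output_length(K=1000, b=110, t=50, N=1024, delta=0.0, L=4, rho=1.0, eps=2**-10)
```

The example makes the additive dependence on the storage bound visible: $b$ is subtracted from $K$ rather than dividing it.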

We give two instantiations of this result. The first one uses the codes from Fact 3.8 and lets us
achieve optimal seed length. We obtain it by setting $\rho = K^{\gamma/2}$, for any $\gamma > 0$, and using the
combinatorial designs guaranteed by Theorem 4.2.

Corollary 4.6 Let $\gamma, c, d > 0$ be any constants. Let $C_r$ be the code obtained from Fact 3.8 by
setting $\varepsilon = N^{-c}$. Then the function $\mathrm{Ext}_{C_r} : \{0,1\}^N \times \{0,1\}^t \to \{0,1\}^m$, where $t = O(\log N)$ and
$m = \Omega((K - b)/K^{\gamma})$, is a $(K, 2\varepsilon)$ extractor against $b$ qubits of quantum storage for any $K > N^{c}$.

An inconvenient aspect of this construction, particularly relevant to cryptography, is that, even
though the extractor is polynomial-time computable, it is not locally computable. Indeed, any bit
of the output may require polynomial time to be computed, whereas one might wish for it to be
computable in polylogarithmic time. We achieve such an extractor by taking $C = C_k : \{0,1\}^N \to
\{0,1\}^{\binom{N}{k}}$ the XOR code $C_k(x)_{y_1, \ldots, y_k} = x_{y_1} \oplus \cdots \oplus x_{y_k}$. By using these codes together with the
designs from Theorem 4.3, the bound from Corollary 3.10 gives the following:

Corollary 4.7 Let $\alpha, \delta, c > 0$ be any constants. Then there is a $k = O(\log(m/\varepsilon)/\delta^2)$ such that the
function $\mathrm{Ext}_{C_k} : \{0,1\}^N \times \{0,1\}^t \to \{0,1\}^m$, where $t = O(\log^4 N)$ and $m = \Omega((\alpha - 2\delta)N - b)$, is a
$(\alpha N, N^{-c})$ extractor against $b$ qubits of quantum storage.

Note that this extractor is locally computable, and every individual bit of the output can be computed
in polylogarithmic time, as the designs in Theorem 4.3 are locally computable. Note also that the
extractor only works for linear entropy rates: as mentioned earlier, this is tight due to lower bounds
by Viola [Vio04] on the seed length required to extract from sources with polynomially small
min-entropy using low complexity circuits.
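
Local computability of the XOR code is visible in a direct implementation: one codeword bit reads only the $k$ source positions that index it. The sketch below assumes 0-indexed positions and uses a function name of our own choosing.

```python
from functools import reduce
from operator import xor
from typing import Sequence

def xor_code_bit(x: Sequence[int], positions: Sequence[int]) -> int:
    """One bit of C_k(x): the XOR of x at the given positions (reads only those bits)."""
    return reduce(xor, (x[p] for p in positions), 0)

# Toy source of N = 5 bits and a codeword position indexed by k = 3 coordinates.
bit = xor_code_bit([1, 0, 1, 1, 0], (0, 2, 3))
```

Combined with a design whose individual sets can be generated efficiently, each output bit of the extractor therefore depends on only $k$ bits of the source.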



5 Proof of security 

We give the proof of security of our construction. The first steps of the proof follow the general
reconstruction paradigm from [Tre01], and we give them first.



5.1 Proofs in the reconstruction paradigm 

We start with the following standard observation. 



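Observation 5.1 In order to prove that $\mathrm{Ext} : \{0,1\}^N \times \{0,1\}^t \to \{0,1\}^m$ is a $(K, 2\varepsilon)$ strong
extractor against $b$ qubits of storage, it suffices to prove that for any collection of measurements
$\{M^0_{u,y}, M^1_{u,y}\}_{(u,y) \in \{0,1\}^{m+t}}$ on $\mathcal{D}_b$, and $\Psi : \{0,1\}^N \to \mathcal{D}_b$, there are at most $\varepsilon 2^K$ strings $x \in \{0,1\}^N$
such that

$$\mathbf{E}_{y \sim U_t}\, \mathrm{Tr}\Bigl[\mathbf{E}_{u \in \{0,1\}^m} M^1_{u,y}\, \Psi(x)\, (M^1_{u,y})^\dagger - M^1_{\mathrm{Ext}(x,y),y}\, \Psi(x)\, (M^1_{\mathrm{Ext}(x,y),y})^\dagger\Bigr] > \varepsilon \qquad (5)$$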



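Proof: Assume for contradiction that $\mathrm{Ext} : \{0,1\}^N \times \{0,1\}^t \to \{0,1\}^m$ is not a $(K, 2\varepsilon)$ strong
extractor against $b$ qubits of quantum storage. By definition, there exist measurements
$\{M^0_{u,y}, M^1_{u,y}\}_{(u,y) \in \{0,1\}^{m+t}}$ on $b$ qubits such that

$$\mathbf{E}_{x \sim X,\, y \sim U_t}\, \mathrm{Tr}\Bigl[\mathbf{E}_{u \in \{0,1\}^m} M^1_{u,y}\, \Psi(x)\, (M^1_{u,y})^\dagger - M^1_{\mathrm{Ext}(x,y),y}\, \Psi(x)\, (M^1_{\mathrm{Ext}(x,y),y})^\dagger\Bigr] > 2\varepsilon$$

where $X$ is the source's distribution. Since it has min-entropy at least $K$, every string $x$ is sampled
with probability at most $2^{-K}$. As the quantity inside the expectation over $x$ is at most 1, an averaging
argument shows that for at least $\varepsilon 2^K$ inputs $x$,

$$\mathbf{E}_{y \sim U_t}\, \mathrm{Tr}\Bigl[\mathbf{E}_{u \in \{0,1\}^m} M^1_{u,y}\, \Psi(x)\, (M^1_{u,y})^\dagger - M^1_{\mathrm{Ext}(x,y),y}\, \Psi(x)\, (M^1_{\mathrm{Ext}(x,y),y})^\dagger\Bigr] > \varepsilon$$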



Fix a collection of measurements $\{M^0_{u,y}, M^1_{u,y}\}_{(u,y) \in \{0,1\}^{m+t}}$ on $\mathcal{D}_b$. The previous observation shows
that, in order to show that Ext is a strong extractor, it suffices to bound the number of strings $x$
such that (5) holds. For this, we use the reconstruction approach in [Tre01]. For a fixed $x$, define
$M_x : \{0,1\}^{m+t} \to \{0,1\}$ as the probabilistic procedure which, on input $(u, y) \in \{0,1\}^{m+t}$, outputs 1
with probability $\mathrm{Tr}(M^1_{u,y}\, \Psi(x)\, (M^1_{u,y})^\dagger)$, where $\Psi(x)$ is the state of the adversary's storage on $x$. For
the most part our proofs will simply treat $M_x$ as a probabilistic oracle. Moreover, all probabilities
that we write involving $M_x$, or other oracle circuits making calls to $M_x$, will implicitly be taken over
$M_x$'s internal randomness.

The first step is to use the standard hybrid argument followed by Yao's distinguisher versus predictor
lemma to get an oracle circuit $T$ which queries $M_x$ exactly once, and is such that $T$ predicts $\mathrm{Ext}(x, y)_i$
with some advantage over a random guess when $y$ as well as the value of $x$ on some related points
are given as input. We skip the (by now, standard) argument and state the final result (see [Tre01]
for details).

Lemma 5.2 Let $x, \varepsilon$ be such that (5) is satisfied, and $\mathrm{Ext}(x, y)_i$ be the $i$-th bit of the extractor's
output on $(x, y)$. Then using $m + \log m + 3$ bits of classical advice, we can construct an oracle circuit
$T$ which makes one query to $M_x$ and is such that for some $1 \le i \le m$, $T$ satisfies:

$$\Pr_{y \sim U_t}\bigl[T^{M_x}(y, \mathrm{Ext}(x,y)_1, \ldots, \mathrm{Ext}(x,y)_{i-1}) = \mathrm{Ext}(x,y)_i\bigr] > \tfrac{1}{2} + \tfrac{\varepsilon}{m} \qquad (6)$$

Our next step is to construct a small circuit $R_x$ which predicts the value of $C(x)$ at any position $y$
with some non-negligible success probability, leading to the following technical lemma:

Lemma 5.3 Let $x, \varepsilon$ be such that (5) is satisfied. Then using $m(1 + \rho) + \log m + t + O(1)$ bits of
classical advice, we can construct an oracle circuit $R_x$ which makes one query to $M_x$ and predicts
$C(x)_z$ with probability $1/2 + \varepsilon/m$, on average over the choice of $z \in \{0,1\}^{\log\bar{N}}$.

Proof: By Lemma 5.2, using $m + \log m + 3$ bits of advice, we can get an oracle circuit $T$ which
makes exactly one query to $M_x$ and for some $1 \le i \le m$ satisfies

$$\Pr_{y}\bigl[T^{M_x}(y, C(x)_{y_{S_1}}, \ldots, C(x)_{y_{S_{i-1}}}) = C(x)_{y_{S_i}}\bigr] > \tfrac{1}{2} + \tfrac{\varepsilon}{m}$$

Let us split $y$ into two parts $z = y_{S_i}$ and $w = y_{[t] \setminus S_i}$. Let $y_{S_j}$ be denoted by $h_j(z, w)$. The above
probability can then be rewritten as

$$\Pr_{z, w}\bigl[T^{M_x}(z, w, C(x)_{h_1(z,w)}, \ldots, C(x)_{h_{i-1}(z,w)}) = C(x)_z\bigr] > \tfrac{1}{2} + \tfrac{\varepsilon}{m}$$

By an averaging argument, we can fix a $w$ (using at most $t$ bits of advice) such that the above
inequality holds with the probability taken over $z$. Let us hardwire all the possible values of $C(x)_{h_j(z,w)}$
(for the fixed value of $w$), as $z$ varies over $\{0,1\}^{\log\bar{N}}$ and $j$ varies between 1 and $i - 1$, into the circuit
$T$. By the definition of a weak design, there are at most $(m - 1)\rho$ bits that need to be hardwired.
Let $R_x$ be the circuit with all the hardwired values. $R_x$ satisfies the following

$$\Pr_{z}\bigl[R_x^{M_x}(z) = C(x)_z\bigr] > \tfrac{1}{2} + \tfrac{\varepsilon}{m} \qquad (7)$$

The total classical advice taken so far is $m + \log m + t + m\rho + O(1)$. ■
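
The advice count can be tallied as follows. For the fixed $w$, each $h_j(z, w)$ depends on $z$ only through the bits indexed by $S_i \cap S_j$, so it takes at most $2^{|S_i \cap S_j|}$ distinct values, which is where the weak design property enters. The grouping of terms below is our own bookkeeping.

```latex
\[
\underbrace{m + \log m + 3}_{\text{Lemma 5.2}}
\;+\; \underbrace{t}_{\text{fixing } w}
\;+\; \underbrace{\sum_{j<i} 2^{|S_i \cap S_j|}}_{\text{hardwired values}\ \le\ \rho(m-1)}
\;\le\; m(1+\rho) + \log m + t + O(1).
\]
```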



5.2 Security against quantum storage from lower bounds on QFACs 

Assume for contradiction that there is an adversary to Ext, which can distinguish its output from
uniform given access to the seed $y$ and some partial quantum information $\Psi(x) \in \mathcal{D}_b$ about the source.
Such an adversary can be described by the mapping $\Psi$ together with a collection of measurements
$\{M^0_{u,y}, M^1_{u,y}\}_{(u,y) \in \{0,1\}^{m+t}}$ on $\mathcal{D}_b$ describing the adversary's measurement on his quantum information
$\Psi(x)$, when provided with the seed and the extractor's output.

5 This describes the most general situation, as we can always assume that any measurement made by the adversary 
is done at the end of his recovery procedure.

For a fixed $x$, let $M_x : \{0,1\}^{m+t} \to \{0,1\}$ be as in Section 5.1. By Observation 5.1, to prove that Ext is
a $(K, 2\varepsilon)$ strong extractor secure against $b$ qubits of quantum storage, it suffices to prove that there
are at most $\varepsilon 2^K$ strings $x$ such that (5) holds.

The key conceptual step in our proof is to observe that from the circuit $R_x$ given by Lemma 5.3 we
can construct a QFAC for the family $\mathcal{C} = \{f_i : x \mapsto C(x)_i,\ i \in [\bar{N}]\}$ of codeword positions, and the
set $A$ of all $x$ satisfying (5). The strong lower bounds we proved in Section 3 then let us bound the
size of the set $A$ as a function of the adversary's storage and the list-decoding properties of $C$. The
following claim makes this connection formal.

Claim 5.4 Let $\eta > 0$ and $A \subseteq \{0,1\}^N$ be such that, for any $x \in A$, using only $c$ bits of classical
advice, we can construct a circuit $R_x$ which has access to a $b$-qubit quantum state and is such that
for a random $y$, it predicts $C(x)_y$ with probability $1/2 + \eta$. Then the cardinality of $A$ is at most $s \cdot 2^c$,
where $s$ is the maximum size of a set $B$ such that there exists a $(N, b, \eta)$ QFAC for $(B, \mathcal{C})$.

Proof: The $c$ advice bits partition the set $A$ into $2^c$ sets $A_s$, for $s \in \{0,1\}^c$. Fix such an $s$ and
consider the set $A_s$. Since $s$ has been fixed, all $x \in A_s$ have the same circuit $R_x$; only the $b$-qubit
quantum state $\Psi(x)$ on which it operates depends on $x$. Hence there is a fixed set of measurements
such that, for a random $y$, the measurement $M_y$ on $\Psi(x)$ outputs $C(x)_y$ with probability $1/2 + \eta$.
This means we have a $(N, b, \eta)$ QFAC for $(A_s, \mathcal{C})$. Hence the size of $A_s$ is bounded by the maximum
size of any set for which such a code exists. This gives us the promised bound on $A$. ■
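
The counting behind the claim is a one-line sum over advice strings. To avoid a clash with the size bound $s$ from the statement of Claim 5.4, the advice string is written $\sigma$ here, a renaming of ours.

```latex
\[
|A| \;=\; \sum_{\sigma \in \{0,1\}^c} |A_\sigma|
\;\le\; 2^c \cdot \max_{\sigma \in \{0,1\}^c} |A_\sigma|
\;\le\; s \cdot 2^c .
\]
```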

To finish the proof of Theorem 4.5, note that by Proposition 3.6 any $(N, b, \varepsilon/m)$ QFAC for $(A, \mathcal{C})$
satisfies

$$\log|A| \le b + H(\delta)N + \log L + O(\log(m/\varepsilon))$$

Applying Claim 5.4 to the advice circuit promised by Lemma 5.3, we deduce that the number
of strings $x$ such that (5) holds is at most $2^{b + H(\delta)N + \log L + O(\log(m/\varepsilon))} \cdot 2^{m(1+\rho) + \log m + t + O(1)}$. Using
$\log(m) = O(\log N)$, this expression can be upper-bounded by

$$2^{\,b + H(\delta)N + m(1+\rho) + \log L + t + O(\log(1/\varepsilon) + \log N)}$$

Using the bound on $m$ given in Theorem 4.5, we immediately get that this expression is upper-bounded
by $\varepsilon 2^K$, finishing the proof of the theorem.
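
The last step can be spelled out by substituting the value of $m(1+\rho)$ from Theorem 4.5 into the exponent. The calculation below assumes that the constant hidden in the $\Omega(\cdot)$ term is chosen large enough relative to the one in the $O(\cdot)$ term.

```latex
\begin{align*}
& b + H(\delta)N + m(1+\rho) + \log L + t + O(\log(1/\varepsilon) + \log N) \\
&\qquad = K - \Omega(\log(1/\varepsilon) + \log N) + O(\log(1/\varepsilon) + \log N) \\
&\qquad \le K - \log(1/\varepsilon),
\end{align*}
```

so the number of strings $x$ satisfying (5) is at most $2^{K - \log(1/\varepsilon)} = \varepsilon 2^K$, as required by Observation 5.1.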